Results 1 - 20 of 20
1.
ArXiv ; 2023 Mar 27.
Article in English | MEDLINE | ID: mdl-37033453

ABSTRACT

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.

2.
ArXiv ; 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36994148

ABSTRACT

The branching of an RNA molecule is an important structural characteristic yet difficult to predict correctly, especially for longer sequences. Using plane trees as a combinatorial model for RNA folding, we consider the thermodynamic cost, known as the barrier height, of transitioning between branching configurations. Using branching skew as a coarse energy approximation, we characterize various types of paths in the discrete configuration landscape. In particular, we give sufficient conditions for a path to have both minimal length and minimal branching skew. The proofs offer some biological insights, notably the potential importance of both hairpin stability and domain architecture to higher resolution RNA barrier height analyses.

3.
J Mol Biol ; 435(14): 168047, 2023 07 15.
Article in English | MEDLINE | ID: mdl-36933824

ABSTRACT

Understanding the base pairing of an RNA sequence provides insight into its molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0 identifies the dominant helices in low-energy secondary structures as features, organizes them into profiles which partition the Boltzmann sample, and highlights key similarities/differences among the most informative, i.e. selected, profiles in a graphical format. Version 2.0 enhances every step of this approach. First, the featured substructures are expanded from helices to stems. Second, profile selection includes low-frequency pairings similar to featured ones. In conjunction, these updates extend the utility of the method to sequences up to length 600, as evaluated over a sizable dataset. Third, relationships are visualized in a decision tree which highlights the most important structural differences. Finally, this cluster analysis is made accessible to experimental researchers in a portable format as an interactive webpage, permitting a much greater understanding of trade-offs among different possible base pairing combinations.


Subjects
RNA, RNA Sequence Analysis, Algorithms, Base Pairing, Base Sequence, Cluster Analysis, Nucleic Acid Conformation, RNA/chemistry
4.
Genes (Basel) ; 12(4), 2021 03 25.
Article in English | MEDLINE | ID: mdl-33805944

ABSTRACT

Minimum free energy prediction of RNA secondary structures is based on the Nearest Neighbor Thermodynamics Model. While such predictions are typically good, the accuracy can vary widely even for short sequences, and the branching thermodynamics are an important factor in this variance. Recently, the simplest model for multiloop energetics, a linear function of the number of branches and unpaired nucleotides, was found to be the best. Subsequently, a parametric analysis demonstrated that per family accuracy can be improved by changing the weightings in this linear function. However, the extent of improvement was not known due to the ad hoc method used to find the new parameters. Here we develop a branch-and-bound algorithm that finds the set of optimal parameters with the highest average accuracy for a given set of sequences. Our analysis shows that the previous ad hoc parameters are nearly optimal for tRNA and 5S rRNA sequences on both training and testing sets. Moreover, cross-family improvement is possible but more difficult because competing parameter regions favor different families. The results also indicate that restricting the unpaired nucleotide penalty to small values is warranted. This reduction makes analyzing longer sequences using the present techniques more feasible.
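
The linear multiloop model mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the usual three-parameter form (an offset, a weight per branch, and a weight per unpaired nucleotide). The names `MultiloopParams` and `multiloop_energy` and the numeric values are hypothetical, chosen for illustration only; they are not taken from the paper or from any thermodynamic parameter set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiloopParams:
    """Hypothetical container for the three weights of a linear multiloop model."""
    a: float  # constant offset (initiation term)
    b: float  # weight per branching helix
    c: float  # weight per unpaired nucleotide in the loop


def multiloop_energy(params: MultiloopParams, branches: int, unpaired: int) -> float:
    """Linear multiloop score: a + b * branches + c * unpaired."""
    return params.a + params.b * branches + params.c * unpaired


# Illustrative values only (assumed, not published parameters).
example = MultiloopParams(a=1.0, b=2.0, c=0.5)
print(multiloop_energy(example, branches=3, unpaired=4))
```

Changing the weights `a`, `b`, and `c` shifts which branching configurations score best, which is the kind of reweighting the parametric analysis explores.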


Subjects
5S Ribosomal RNA/chemistry, Transfer RNA/chemistry, RNA/chemistry, Algorithms, Entropy, Humans, Nucleic Acid Conformation, RNA/genetics, 5S Ribosomal RNA/genetics, Transfer RNA/genetics, Thermodynamics
5.
Bioinformatics ; 37(20): 3660-3661, 2021 Oct 25.
Article in English | MEDLINE | ID: mdl-33823536

ABSTRACT

SUMMARY: We present a new graphical tool for RNA secondary structure analysis. The central feature is the ability to visually compare/contrast up to three base pairing configurations for a given sequence in a compact, standardized circular arc diagram layout. This is complemented by a built-in CT-style file viewer and radial layout substructure viewer which are directly linked to the arc diagram window via the zoom selection tool. Additional functionality includes the computation of some numerical information, and the ability to export images and data for later use. This tool should be of use to researchers seeking to better understand similarities and differences between structural alternatives for an RNA sequence. AVAILABILITY AND IMPLEMENTATION: https://github.com/gtDMMB/RNAStructViz/wiki.

6.
Bull Math Biol ; 82(10): 133, 2020 10 07.
Article in English | MEDLINE | ID: mdl-33029669

ABSTRACT

A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data, and not the sequence, which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov-Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.
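
For readers unfamiliar with the Nussinov-Jacobson model named in this abstract, the sketch below shows the standard base-pair maximization recursion associated with it. It is a textbook illustration, not the authors' analysis. The function name `nussinov_max_pairs`, the default minimum hairpin loop size of 3, and the inclusion of G-U wobble pairs are assumptions made for the example.

```python
def nussinov_max_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs under a Nussinov-style model.

    min_loop is the assumed minimum number of unpaired bases enclosed by a pair.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


print(nussinov_max_pairs("GGGAAAUCC"))
```

The recursion counts pairs only and ignores energies, which is why the model is useful for combinatorial arguments but too coarse for quantitative structure prediction.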


Subjects
Biological Models, RNA, Algorithms, Base Sequence, Mathematical Concepts, Nucleic Acid Conformation, Nucleotides, RNA/chemistry, RNA/genetics
7.
J Struct Biol ; 210(1): 107475, 2020 04 01.
Article in English | MEDLINE | ID: mdl-32032754

ABSTRACT

Prediction of RNA base pairings yields insight into molecular structure, and therefore function. The most common methods predict an optimal structure under the standard thermodynamic model. One component of this model is the equation which governs the cost of branching, where three or more helical "arms" radiate out from a multiloop (also known as a junction). The multiloop initiation equation has three parameters; changing those values can significantly alter the predicted structure. We give a complete analysis of the prediction accuracy, stability, and robustness for all possible parameter combinations for a diverse set of tRNA sequences, and also for 5S rRNA. We find that the accuracy can often be substantially improved on a per sequence basis. However, simultaneous improvement within families, and most especially between families, remains a challenge.


Subjects
Ribosomal RNA/chemistry, RNA/chemistry, Algorithms, Nucleic Acid Conformation, Thermodynamics
8.
Comput Math Biophys ; 7(1): 48-63, 2019 Jan.
Article in English | MEDLINE | ID: mdl-34113790

ABSTRACT

A riboswitch is a type of RNA molecule that regulates important biological functions by changing structure, typically under ligand-binding. We assess the extent that these ligand-bound structural alternatives are present in the Boltzmann sample, a standard RNA secondary structure prediction method, for three riboswitch test cases. We use the cluster analysis tool RNAStructProfiling to characterize the different modalities present among the suboptimal structures sampled. We compare these modalities to the putative base pairing models obtained from independent experiments using NMR or fluorescence spectroscopy. We find, somewhat unexpectedly, that profiling the Boltzmann sample captures evidence of ligand-bound conformations for two of three riboswitches studied. Moreover, this agreement between predicted modalities and experimental models is consistent with the classification of riboswitches into thermodynamic versus kinetic regulatory mechanisms. Our results support cluster analysis of Boltzmann samples by RNAStructProfiling as a possible basis for de novo identification of thermodynamic riboswitches, while highlighting the challenges for kinetic ones.

9.
Biophys J ; 113(2): 321-329, 2017 Jul 25.
Article in English | MEDLINE | ID: mdl-28629618

ABSTRACT

Understanding how RNA secondary structure prediction methods depend on the underlying nearest-neighbor thermodynamic model remains a fundamental challenge in the field. Minimum free energy (MFE) predictions are known to be "ill conditioned" in that small changes to the thermodynamic model can result in significantly different optimal structures. Hence, the best practice is now to sample from the Boltzmann distribution, which generates a set of suboptimal structures. Although the structural signal of this Boltzmann sample is known to be robust to stochastic noise, the conditioning and robustness under thermodynamic perturbations have yet to be addressed. We present here a mathematically rigorous model for conditioning inspired by numerical analysis, and also a biologically inspired definition for robustness under thermodynamic perturbation. We demonstrate the strong correlation between conditioning and robustness and use its tight relationship to define quantitative thresholds for well versus ill conditioning. These resulting thresholds demonstrate that the majority of the sequences are at least sample robust, which verifies the assumption of sampling's improved conditioning over the MFE prediction. Furthermore, because we find no correlation between conditioning and MFE accuracy, the presence of both well- and ill-conditioned sequences indicates the continued need for both thermodynamic model refinements and alternate RNA structure prediction methods beyond the physics-based ones.
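
As background for the Boltzmann sampling discussed in this abstract, the following sketch computes Boltzmann probabilities for a small, made-up list of structure free energies and draws a sample from them. The function name `boltzmann_probabilities`, the energies, and the default temperature of 310.15 K are assumptions for illustration. Real tools compute the partition function over all structures by dynamic programming rather than by enumeration.

```python
import math
import random

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def boltzmann_probabilities(energies, temperature_k=310.15):
    """Return P(s) = exp(-E_s / RT) / Z for each free energy E_s in kcal/mol."""
    rt = GAS_CONSTANT * temperature_k
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


# Toy ensemble: three hypothetical structures with assumed free energies.
toy_energies = [-10.0, -9.0, -5.0]
probs = boltzmann_probabilities(toy_energies)

# Draw a small sample of structure indices according to those probabilities.
sample = random.choices(range(len(toy_energies)), weights=probs, k=10)
print(probs, sample)
```

Because probabilities depend exponentially on energy, a small perturbation of the energy model can noticeably reweight near-optimal structures, which is the sensitivity the conditioning analysis examines.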


Subjects
Molecular Models, Nucleic Acid Conformation, RNA, Thermodynamics, RNA/chemistry, Stochastic Processes
10.
Wiley Interdiscip Rev RNA ; 7(3): 278-94, 2016 05.
Article in English | MEDLINE | ID: mdl-26971529

ABSTRACT

A widening gap exists between the best practices for RNA secondary structure prediction developed by computational researchers and the methods used in practice by experimentalists. Minimum free energy predictions, although broadly used, are outperformed by methods which sample from the Boltzmann distribution and data mine the results. In particular, moving beyond the single structure prediction paradigm yields substantial gains in accuracy. Furthermore, the largest improvements in accuracy and precision come from viewing secondary structures not at the base pair level but at lower granularity/higher abstraction. This suggests that random errors affecting precision and systematic ones affecting accuracy are both reduced by this 'fuzzier' view of secondary structures. Thus experimentalists who are willing to adopt a more rigorous, multilayered approach to secondary structure prediction by iterating through these levels of granularity will be much better able to capture fundamental aspects of RNA base pairing. WIREs RNA 2016, 7:278-294. doi: 10.1002/wrna.1334 For further resources related to this article, please visit the WIREs website.


Subjects
Computational Biology/methods, Nucleic Acid Conformation, RNA/chemistry, Cluster Analysis
11.
Nucleic Acids Res ; 42(22): e171, 2014 Dec 16.
Article in English | MEDLINE | ID: mdl-25392423

ABSTRACT

As the biomedical impact of small RNAs grows, so does the need to understand competing structural alternatives for regions of functional interest. Suboptimal structure analysis provides significantly more RNA base pairing information than a single minimum free energy prediction. Yet computational enhancements like Boltzmann sampling have not been fully adopted by experimentalists since identifying meaningful patterns in this data can be challenging. Profiling is a novel approach to mining RNA suboptimal structure data which makes the power of ensemble-based analysis accessible in a stable and reliable way. Balancing abstraction and specificity, profiling identifies significant combinations of base pairs which dominate low-energy RNA secondary structures. By design, critical similarities and differences are highlighted, yielding crucial information for molecular biologists. The code is freely available via http://gtfold.sourceforge.net/profiling.html.


Subjects
Small Untranslated RNA/chemistry, RNA Sequence Analysis/methods, Base Pairing, Statistical Data Interpretation, Molecular Models, Nucleic Acid Conformation, Bacterial RNA/chemistry, Vibrio cholerae/genetics
12.
J Math Biol ; 69(6-7): 1743-72, 2014 Dec.
Article in English | MEDLINE | ID: mdl-24384698

ABSTRACT

We analyze the distribution of RNA secondary structures given by the Knudsen-Hein stochastic context-free grammar used in the prediction program Pfold. Our main theorem gives relations between the expected number of these motifs, independent of the grammar probabilities. These relations are a consequence of proving that the distribution of base pairs, of helices, and of different types of loops is asymptotically Gaussian in this model of RNA folding. Proof techniques use singularity analysis of probability generating functions. We also demonstrate that these asymptotic results capture well the expected number of RNA base pairs in native ribosomal structures, and certain other aspects of their predicted secondary structures. In particular, we find that the predicted structures largely satisfy the expected relations, although the native structures do not.


Subjects
Chemical Models, Nucleic Acid Conformation, RNA Folding, RNA/chemistry, Algorithms, Base Pairing, Normal Distribution, Stochastic Processes, Thermodynamics
13.
J Biol Phys ; 39(2): 163-72, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23860866

ABSTRACT

There are two important problems in the assembly of small, icosahedral RNA viruses. First, how does the capsid protein select the viral RNA for packaging, when there are so many other candidate RNA molecules available? Second, what is the mechanism of assembly? With regard to the first question, there are a number of cases where a particular RNA sequence or structure (often one or more stem-loops) either promotes assembly or is required for assembly, but there are others where specific packaging signals are apparently not required. With regard to the assembly pathway, in those cases where stem-loops are involved, the first step is generally believed to be binding of the capsid proteins to these "fingers" of the RNA secondary structure. In the mature virus, the core of the RNA would then occupy the center of the viral particle, and the stem-loops would reach outward, towards the capsid, like stalagmites reaching up from the floor of a grotto towards the ceiling. Those viruses whose assembly does not depend on protein binding to stem-loops could have a different structure, with the core of the RNA lying just under the capsid, and the fingers reaching down into the interior of the virus, like stalactites. We review the literature on these alternative structures, focusing on RNA selectivity and the assembly mechanism, and we propose experiments aimed at determining, in a given virus, which of the two structures actually occurs.


Subjects
Viral Genome, RNA Viruses/genetics, Levivirus/chemistry, Levivirus/genetics, Molecular Models, RNA Viruses/chemistry, Tobacco Mosaic Satellite Virus/chemistry, Tobacco Mosaic Satellite Virus/genetics
14.
Nucleic Acids Res ; 41(5): 2807-16, 2013 Mar 01.
Article in English | MEDLINE | ID: mdl-23325843

ABSTRACT

Recent advances in RNA structure determination include using data from high-throughput probing experiments to improve thermodynamic prediction accuracy. We evaluate the extent and nature of improvements in data-directed predictions for a diverse set of 16S/18S ribosomal sequences using a stochastic model of experimental SHAPE data. The average accuracy for 1000 data-directed predictions always improves over the original minimum free energy (MFE) structure. However, the amount of improvement varies with the sequence, exhibiting a correlation with MFE accuracy. Further analysis of this correlation shows that accurate MFE base pairs are typically preserved in a data-directed prediction, whereas inaccurate ones are not. Thus, the positive predictive value of common base pairs is consistently higher than the directed prediction accuracy. Finally, we confirm sequence dependencies in the directability of thermodynamic predictions and investigate the potential for greater accuracy improvements in the worst performing test sequence.


Subjects
Computer Simulation, Molecular Models, 16S Ribosomal RNA/chemistry, 18S Ribosomal RNA/chemistry, Software, Algorithms, Animals, Likelihood Functions, Nucleic Acid Conformation, Archaeal RNA/chemistry, Bacterial RNA/chemistry, Stochastic Processes, Thermodynamics
15.
BMC Res Notes ; 5: 341, 2012 Jul 02.
Article in English | MEDLINE | ID: mdl-22747589

ABSTRACT

BACKGROUND: Accurate and efficient RNA secondary structure prediction remains an important open problem in computational molecular biology. Historically, advances in computing technology have enabled faster and more accurate RNA secondary structure predictions. Previous parallelized prediction programs achieved significant improvements in runtime, but their implementations were not portable from niche high-performance computers or easily accessible to most RNA researchers. With the increasing prevalence of multi-core desktop machines, a new parallel prediction program is needed to take full advantage of today's computing technology. FINDINGS: We present here the first implementation of RNA secondary structure prediction by thermodynamic optimization for modern multi-core computers. We show that GTfold predicts secondary structure in less time than UNAfold and RNAfold, without sacrificing accuracy, on machines with four or more cores. CONCLUSIONS: GTfold supports advances in RNA structural biology by reducing the timescales for secondary structure prediction. The difference will be particularly valuable to researchers working with lengthy RNA sequences, such as RNA viral genomes.


Subjects
Algorithms, Computational Biology/methods, RNA/chemistry, Software, Computational Biology/instrumentation, Nucleic Acid Conformation, RNA Sequence Analysis, Thermodynamics
16.
J Struct Biol ; 180(1): 110-6, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22750417

ABSTRACT

Satellite tobacco mosaic virus (STMV) is an icosahedral T=1 single-stranded RNA virus with a genome containing 1058 nucleotides. X-ray crystallography revealed a structure containing 30 double-helical RNA segments, with each helix having nine base pairs and an unpaired nucleotide at the 3' end of each strand. Based on this structure, Larson and McPherson proposed a model of 30 hairpin-loop elements occupying the edges of the icosahedron and connected by single-stranded regions. More recently, Schroeder et al. have combined the results of chemical probing with a novel helix searching algorithm to propose a specific secondary structure for the STMV genome, compatible with the Larson-McPherson model. Here we report an all-atom model of STMV, using the complete protein and RNA sequences and the Schroeder RNA secondary structure. As far as we know, this is the first all-atom model for the complete structure of any virus (100% of the atoms) using the natural genomic sequence.


Subjects
Capsid/ultrastructure, Molecular Models, Viral RNA/ultrastructure, Tobacco Mosaic Satellite Virus/ultrastructure, Capsid/chemistry, X-Ray Crystallography, Inverted Repeat Sequences, Nucleic Acid Conformation, Quaternary Protein Structure, Viral RNA/chemistry
17.
Bull Math Biol ; 73(4): 754-76, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21207176

ABSTRACT

Motivated by recent work in parametric sequence alignment, we study the parameter space for scoring RNA folds and construct an RNA polytope. A vertex of this polytope corresponds to RNA secondary structures with common branching. We use this polytope and its normal fan to study the effect of varying three parameters in the free energy model that are not determined experimentally. Our results indicate that variation of these specific parameters does not have a dramatic effect on the structures predicted by the free energy model. We additionally map a collection of known RNA secondary structures to the RNA polytope.


Subjects
Molecular Models, Nucleic Acid Conformation, RNA/chemistry, Thermodynamics, Algorithms, Base Sequence, Nucleic Acid Databases
18.
Nucleic Acids Res ; 37(4): e29, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19158187

ABSTRACT

The identification of small structural motifs and their organization into larger subassemblies is of fundamental interest in the analysis, prediction and design of 3D structures of large RNAs. This problem has been studied only sparsely, as most of the existing work is limited to the characterization and discovery of motifs in RNA secondary structures. We present a novel geometric method for the characterization and identification of structural motifs in 3D rRNA molecules. This method enables the efficient recognition of known 3D motifs, such as tetraloops, E-loops, kink-turns and others. Furthermore, it provides a new way of characterizing complex 3D motifs, notably junctions, that have been defined and identified in the secondary structure but have not been analyzed and classified in three dimensions. We demonstrate the relevance and utility of our approach by applying it to the Haloarcula marismortui large ribosomal unit. Pending the implementation of a dedicated web server, the code accompanying this article, written in JAVA, is available upon request from the contact author.


Subjects
Ribosomal RNA/chemistry, Computational Biology/methods, Haloarcula marismortui/genetics, Molecular Models, Nucleic Acid Conformation, Ribosomal RNA/classification, RNA Sequence Analysis
19.
Bull Math Biol ; 71(1): 84-106, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19083065

ABSTRACT

We give a Large Deviation Principle (LDP) with explicit rate function for the distribution of vertex degrees in plane trees, a combinatorial model of RNA secondary structures. We calculate the typical degree distributions based on nearest neighbor free energies, and compare our results with the branching configurations found in two sets of large RNA secondary structures. We find substantial agreement overall, with some interesting deviations which merit further study.


Subjects
Molecular Models, Nucleic Acid Conformation, 23S Ribosomal RNA/ultrastructure, Viral RNA/ultrastructure, Statistical Data Interpretation, Decision Trees, Computer Neural Networks, Picornaviridae/genetics, Probability, Thermodynamics
20.
J Stat Phys ; 132(3): 551-560, 2008 Aug.
Article in English | MEDLINE | ID: mdl-20216937

ABSTRACT

We consider large random trees under Gibbs distributions and prove a Large Deviation Principle (LDP) for the distribution of degrees of vertices of the tree. The LDP rate function is given explicitly. An immediate consequence is a Law of Large Numbers for the distribution of vertex degrees in a large random tree. Our motivation for this study comes from the analysis of RNA secondary structures.
